Sensitivity and Specificity of Examination Maneuvers for Carpal Tunnel Syndrome: A Meta-Analysis

Our purpose was to assess the diagnostic validity (sensitivity (Sn) and specificity (Sp)) of physical examination maneuvers for carpal tunnel syndrome (CTS). This meta-analysis followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist. Studies assessing exam maneuvers (including components of the CTS-6) for CTS were identified in the MEDLINE (Medical Literature Analysis and Retrieval System Online) and Embase (Excerpta Medica Database) databases. Maneuvers assessed included Phalen's test, Tinel's sign, the Durkan test, the scratch-collapse test, Semmes-Weinstein monofilament (SWM), and the static 2-point discrimination (2PD) test. Data extracted included article name, total number of subjects/hands, type of exam, and exam Sn/Sp. Forest plots were used to display the estimated Sn/Sp, and boxplots were used to demonstrate the locality, spread, and skewness of the Sn/Sp through the quartiles. After screening 570 articles, 67 articles involving 8924 hands were included. Forty-eight articles assessed Phalen's test, 45 assessed Tinel's sign, 21 assessed the Durkan test, seven assessed the scratch-collapse test, 11 assessed SWM, and six assessed the static 2PD test. Phalen's test demonstrated the greatest median Sn (0.70, (Q1, Q3): (0.51, 0.85)), followed by the Durkan test (0.67, (Q1, Q3): (0.46, 0.82)). 2PD demonstrated the highest median Sp (0.90, (Q1, Q3): (0.88, 0.90)), followed by SWM (0.85, (Q1, Q3): (0.51, 0.89)). There is considerable variability in the reported validity of physical exam tests used in the diagnosis of CTS. Upper-extremity surgeons should be aware of the inherent limitations of individual exam maneuvers. In the absence of a uniformly accepted diagnostic gold standard, a combination of exams, along with pertinent patient history, should guide the diagnosis of CTS.


Introduction And Background
Carpal tunnel syndrome (CTS) is frequently encountered in primary care and subspecialty clinics and remains the most common form of peripheral compressive neuropathy in the upper extremity [1][2][3]. A combination of an accurate patient history and a focused physical examination is required for the diagnosis of this clinical condition. The American Academy of Orthopaedic Surgeons (AAOS) recently modified the clinical practice guidelines (CPGs) for CTS such that electrodiagnostic studies (EDS) are no longer required for cases without diagnostic uncertainty [4]. However, the routine utilization of EDS in uncomplicated CTS remains controversial. Previous reports have indicated that 26% of hand surgeons require EDS prior to consultations relating to CTS [5]. The use of EDS has been shown to increase healthcare costs and does not change the likelihood of diagnosing CTS in cases without diagnostic uncertainty [6,7].
For this reason, the CTS-6 was developed to aid in standardizing the diagnostic criteria [7]. Using a combination of simple history and physical exam items, the CTS-6 was intended to be used by non-expert clinicians who frequently encounter hand numbness as a chief complaint [7]. Since its publication, the CTS-6 has been used as a reference standard to compare other diagnostic modalities and has demonstrated substantial levels of inter-rater reliability for the examination components [8]. The physical exam components of the CTS-6 include the presence of thenar atrophy/weakness, Phalen's test, static two-point discrimination (2PD) test, and a Tinel's sign [7]. Scoring within the CTS-6 is weighted by examination performance and was designed to be used in aggregate, as no single examination finding, in isolation, is diagnostic of CTS [9].
Although the CTS-6 has been established as a practical diagnostic tool, other physical exam maneuvers that are not included in the CTS-6 are often still utilized by physicians as part of the evaluation of peripheral compressive neuropathy. Carpal tunnel compression was described by Durkan in 1991 with an Sn/Sp of 0.87 and 0.90, respectively [10]. Subsequent publications have demonstrated wide variations in the Sn/Sp for the Durkan test with different examiners and methodologies [10][11][12][13][14]. Similarly, there has been interest in the scratch-collapse test for the diagnosis of CTS, with the initial series demonstrating an Sn of 0.64 [15]. However, subsequent blinded follow-up studies have reported an Sn of 0.24 and an Sp of 0.60, again demonstrating wide variation in exam validity [15][16][17]. Variability in the validity of these physical exam maneuvers introduces potential uncertainties in the diagnostic work-up of CTS. Considering the variability in examination performance characteristics across individual studies, aggregate data may more accurately reflect individual examination validity.
The purpose of this meta-analysis was to analyze the validity (Sn/Sp) of common physical examination tests and maneuvers utilized in the evaluation and diagnosis of CTS. We hypothesized that the reported Sn/Sp values for CTS physical examination maneuvers would be variable throughout the existing literature.

Materials and methods
This review follows the transparent reporting guidelines outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist.

Literature Search
On

Study Selection and Data Extraction
Titles and citations were screened for duplicates. The remaining titles and abstracts were screened for eligibility independently by two authors (YO, DSH) with any discrepancy being resolved through discussion and consultation with a third author. Eligible full-text articles were assessed against the inclusion and exclusion criteria and data extracted. Data were extracted from studies including article name, total number of subjects/hands, type of exam, and examination diagnostic validity (Sn/Sp).
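As an illustration of the extracted quantities, the sketch below shows how Sn and Sp follow from a study's 2x2 comparison of an exam maneuver against its reference standard. The class and function names (`TwoByTwo`, `sensitivity`, `specificity`) and the counts are hypothetical and are not taken from any included study; the review itself extracted Sn/Sp as reported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing an exam maneuver against a reference standard."""
    true_pos: int   # maneuver positive, reference positive
    false_neg: int  # maneuver negative, reference positive
    true_neg: int   # maneuver negative, reference negative
    false_pos: int  # maneuver positive, reference negative


def sensitivity(t: TwoByTwo) -> float:
    """True positive rate: TP / (TP + FN)."""
    reference_positive = t.true_pos + t.false_neg
    if reference_positive == 0:
        raise ValueError("no reference-positive hands; Sn is undefined")
    return t.true_pos / reference_positive


def specificity(t: TwoByTwo) -> float:
    """True negative rate: TN / (TN + FP)."""
    reference_negative = t.true_neg + t.false_pos
    if reference_negative == 0:
        raise ValueError("no reference-negative hands; Sp is undefined")
    return t.true_neg / reference_negative


# Hypothetical counts for illustration only.
example = TwoByTwo(true_pos=70, false_neg=30, true_neg=45, false_pos=5)
print(f"Sn = {sensitivity(example):.2f}, Sp = {specificity(example):.2f}")
# prints "Sn = 0.70, Sp = 0.90"
```

Because both quantities depend on which hands the reference standard labels as CTS, the same maneuver can yield different Sn/Sp values under different reference standards.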

Statistical Methods and Analysis
The overall relationship between Sn and Sp was displayed using scatterplots. For each individual examination, forest plots were used to display the estimated Sn/Sp from the included studies. Boxplots were used to graphically demonstrate the locality, spread, and skewness of the Sn/Sp through the quartiles. Median and quartiles (Q1, Q3) were summarized for Sn and Sp. Statistical analyses were performed using RStudio (Posit Software, PBC, Boston, Massachusetts, United States).

Results
Figure 1 shows a flowchart of article inclusion through the study period. After initially screening 570 articles, 67 articles involving a total of 8722 patients (8924 hands) were included in the review.
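The median and quartile summaries described under Statistical Methods could be computed along the following lines. This is a minimal sketch in Python rather than the authors' R code; the helper name `summarize`, the input values, and the choice of quantile definition are assumptions, since the text does not state which definition was used.

```python
import statistics


def summarize(values):
    """Return (Q1, median, Q3) for a list of per-study Sn or Sp values."""
    if len(values) < 2:
        raise ValueError("need at least two studies to compute quartiles")
    # Assumption: method="inclusive" interpolates between observed values,
    # which corresponds to R's default quantile definition (type 7).
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, med, q3


# Hypothetical per-study sensitivities for one maneuver (illustrative only).
reported_sn = [0.20, 0.40, 0.60, 0.80, 1.00]
q1, med, q3 = summarize(reported_sn)
print(f"median {med:.2f}, (Q1, Q3): ({q1:.2f}, {q3:.2f})")
# prints "median 0.60, (Q1, Q3): (0.40, 0.80)"
```

Reporting the median with (Q1, Q3) rather than a pooled mean keeps the summary robust to a few studies with extreme values, which suits the wide between-study variation described here.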

Figure 1: Flowchart of article inclusion. PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
Basic demographic information of patients from the included articles and their bibliographic data are presented in Table 1 and Table 2.

Specificity
The Sp of a test measures its ability to correctly identify non-CTS cases (also known as the true negative rate). Table 2. Table 3 contains a description of the study characteristics stratified by examination.
[18]. However, other studies have reported the contrary, with Phalen's test having higher Sn/Sp compared to Tinel's sign [54,55]. Differences in reported Sn/Sp in individual exams within the CTS-6 are likely attributable to the study heterogeneity of included articles. Differences with respect to the prevalence of CTS, age, reference standard, and study methodology as well as cultural differences in patient populations, variable presentations, and the difference in compression may all contribute to the observed performance variability [40]. In this context, aggregate data may be more generalizable and more accurately reflect overall exam validity.
Phalen's test and 2PD had the highest Sn and Sp, respectively, in our meta-analysis, and both tests are heavily weighted components of the CTS-6. With higher Sp, both Phalen's test and 2PD bolster the probability of the diagnosis of CTS when they are positive. Overall, the points allocated within the CTS-6 are supported by our results. Phalen's test had the highest combined Sn/Sp across the literature and is allocated 5 points within the CTS-6, which is the highest for any single item. This compares to the 4.5 and 4 points allocated for 2PD and Tinel's sign, respectively, which both have high Sp (≥0.80) but lower Sn relative to Phalen's test.
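The way a high Sp strengthens the meaning of a positive result can be made explicit with the positive likelihood ratio. The relations below are standard definitions rather than results of this meta-analysis; the symbol LR+ and the pre-test and post-test odds are introduced here only for illustration.

```latex
\[
\mathrm{LR}^{+} = \frac{\mathrm{Sn}}{1 - \mathrm{Sp}},
\qquad
\text{post-test odds} = \text{pre-test odds} \times \mathrm{LR}^{+}
\]
```

For a hypothetical test with Sn = 0.50 and Sp = 0.90, LR+ = 0.50 / 0.10 = 5, so a positive result multiplies the pre-test odds of CTS by five. As Sp approaches 1, the denominator shrinks and a positive result becomes more informative.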
Other examination maneuvers that are not used as part of the CTS-6 were additionally assessed in this investigation. When initially described, the scratch-collapse test was reported to have Sn/Sp of 0.64/0.69 for diagnosing CTS [15]. Blinded follow-up studies after this initial publication have described lower Sn/Sp values of 0.20/0.60 for the test [15,16]. Further studies on the validity of the scratch-collapse test have reported similar outcomes. Cebron and Curtin reported a range of 0.24-0.77 for the Sn of the scratch-collapse test for diagnosing CTS [56].
Areson et al. redemonstrated the low Sn/Sp of 0.48/0.59 for the scratch-collapse test and advised against the use of the test for clinical decision-making in the setting of CTS [57]. Overall, we found the reported Sn varied substantially between papers, ranging from 0.05 to 0.64. Again, this variability is likely due to heterogeneous methodologies implemented across the included studies. Considering the scratch-collapse test's median Sn of 0.34, our meta-analysis is consistent with recent literature regarding the low validity of the scratch-collapse test. In this context, aggregate analyses, such as systematic reviews or meta-analyses, may provide a more balanced assessment of examination performance and test validity.
The variable performance of physical examination maneuvers noted in our series is also reflected in prior publications assessing imaging and advanced studies utilized in the work-up of CTS. Fowler et al., in their meta-analysis, showed that the Sn for diagnostic ultrasound ranged between 0.57 and 0.98 and the Sp ranged from 0.63 to 1.00 [58]. Landau et al. reviewed the use of power Doppler ultrasound and found that Sn/Sp for diagnosing CTS showed high levels of variability, reporting a range of 0.02-0.93 for Sn and an Sp of 0.89-1.00 [59]. The lack of an accepted diagnostic gold standard or reference standard for CTS likely contributes to this diagnostic uncertainty and serves as a caution against using individual tests in isolation to make the diagnosis.
This meta-analysis has a number of limitations, including the heterogeneity of the study types included. We did not limit our inclusion criteria by study type, including RCTs, case series, and case-control studies. Not all studies that were analyzed used the same reference standard to establish the diagnosis of CTS. The majority of studies in this meta-analysis used EDS as their reference standard, while the remaining studies used a variety of different reference standards such as clinical examinations (CTS-6), imaging studies (sonography), or a combination of these diagnostic techniques. Several different patient populations such as military members, dental workers, sex-specific studies, patients with rheumatoid arthritis, and hospitalized patients have been included in the studies assessed in our meta-analysis [23,27,34,60]. These heterogeneous populations can introduce potential confounding by the presence of selection and Berksonian biases [23,27,34,39,60].

Conflicts of interest:
In compliance with the ICMJE uniform disclosure form, all authors declare the following: Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work. Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work. Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.